Independently Inducible System of Gene Expression

ABSTRACT

The present invention is directed to improved methods for the temporal induction of proteins using the condensed single protein production (cSPP) system.

This application claims priority to U.S. Provisional Application Ser. Nos. 61/195,139, filed Oct. 4, 2008, and 61/211,605, filed Mar. 31, 2009, the disclosures of which are hereby incorporated by reference in their entireties.

BACKGROUND OF THE INVENTION

The production of high quality isotope-enriched protein samples is a prerequisite for applying modern NMR methods for protein structure determination. Although NMR spectroscopy is well suited for rapid semi-automated structure determination of small proteins (MW < 15 kDa), solving structures of larger proteins and multimeric complexes is considerably more challenging. As the size of the molecule increases, so does the molecule's rotational correlation time and, consequently, the efficiency of ¹H-¹H relaxation mechanisms. One way to suppress these effects is to incorporate deuterium into the protein sample, diluting the ¹H-¹H relaxation networks and increasing ¹³C and ¹⁵N relaxation times, resulting in sharper line widths for ¹³C, ¹⁵N and remaining ¹H nuclei, and dramatically improved signal-to-noise ratios. Perdeuteration is generally required for studies of larger proteins, particularly membrane proteins. ²H,¹³C,¹⁵N-enriched protein samples are also central to certain strategies for fully automated analysis of small protein structures.

While deuterium incorporation into protein samples can greatly improve the quality of data collected, the sample preparation itself can be challenging. Cell growth is affected by increasing ²H₂O concentration, and cells must be gradually acclimated to high ²H₂O concentration in incremental steps. Once acclimated, additional isotopically-enriched reagents are introduced, and protein expression can proceed. However, the overall protein yield in these conditions is often significantly reduced. As fermentation media costs for production of uniformly ²H,¹³C,¹⁵N-enriched samples range from $1,500-$3,000 per liter, or higher, perdeuteration methods are generally employed only when absolutely required; for this reason, many potential applications of perdeuterated samples in routine protein NMR applications have not been pursued.

Additionally, expression of the target gene prior to culture condensation and resuspension in isotope-enriched medium leads to heterogeneously labeled protein and leaves approximately 10-20% of the total protein completely unlabeled.

Thus there remains a need for a process that is more cost effective and results in a higher percentage of labeled protein.

SUMMARY OF THE INVENTION

The inventors of the present application have developed a method for efficient production of perdeuterated, ¹³C-, ¹⁵N-enriched protein samples at a fraction of the cost of standard techniques.

To that end, in certain embodiments the present invention is directed to a vector comprising a cspA cold shock promoter; a tetR gene; a tet operator; and a gene encoding a target protein under the control of the tet operator.

In other embodiments, the present invention is directed to a cell containing a vector comprising a gene encoding a target protein; and a vector comprising a gene encoding an mRNA-specific endoribonuclease, wherein the target protein and mRNA-specific endoribonuclease are capable of being induced with different substances.

In yet other embodiments, the invention is directed to a method of labeling a target protein comprising contacting a culture of cells containing the vectors described herein with a substance capable of inducing the mRNA-specific endoribonuclease; and contacting a culture of the cells with an isotope-enriched medium comprising a substance capable of inducing the target protein. In preferred embodiments, the present invention provides for labeling of at least 85%, at least 90% or at least 95% of the target protein.

Other embodiments of the present invention are directed to a protein labeled by the methods described herein.

Other features and advantages of the invention will be apparent from the following description of the preferred embodiments thereof, and from the claims.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 depicts a pColdI(tet) vector map;

FIG. 2 depicts the expression of a truncated C-terminal domain from M-MuLV IN in the full panel of pCold(tet) vectors wherein M, protein marker; C, negative control (pColdIV(tet), no aTc induction); I, pColdI(tet); II, pColdII(tet); III, pColdIII(tet); IV, pColdIV(tet); X, pColdX(tet); TEV, pColdTEV(tet);

FIG. 3 depicts a comparison of protein expression and condensation of pColdI(IPTG) and pColdI(tet) wherein cultures expressing E. coli protein CspA were condensed 5×, 10×, 20×, 30×, 40×, and compared to uncondensed (1×) cultures in (A) the IPTG-inducible pColdI(IPTG) and (B) the tetracycline-inducible pColdI(tet) vectors;

FIG. 4 depicts isotope incorporation in the co-inducible vs. dual inducible cSPP systems wherein Panels A and B present a schematic of the isotope incorporation between (A) the co-inducible pCold(IPTG) vectors, and (B) the dual inducible pCold(tet) vectors. Grey shading indicates protein produced in unlabeled medium; blue arrows indicate time of IPTG induction; black arrow indicates time of anhydrotetracycline (aTc) induction. Panels C and D depict a representative trypsin fragment of CspA protein expressed at 40× condensation in ¹⁵N-labeled minimal medium. Predicted weight of unlabeled and fully labeled, double charged peptide is 621.00 and 626.70 Daltons, respectively;

FIG. 5 depicts an SDS-PAGE analysis of various proteins expressed in the condensed-phase SPP system, comparing expression in H₂O and ²H₂O (a-j). The names of each protein, degree of condensation, and solvent (H₂O or ²H₂O) are indicated in each panel. Panels a-d illustrate four E. coli membrane proteins expressed using the cSPP(IPTG) system. Panel a, OmpA; panel b, OmpX; panel c, YaiZ; panel d, YnaJ. Panels e-f illustrate two soluble proteins expressed using the cSPP(IPTG) system. Panel e, calmodulin (CaM) (eukaryotic); panel f, EnvZB (E. coli). Panels g-j illustrate four additional eukaryotic (HR3461D) and viral (E1B19K, CTD, NTD) proteins expressed with the cSPP(tet) system. Panel g, domain of human proto-oncogene tyrosine protein kinase FER, residues 453-557 (HR3461D); panel h, C-terminal domain of murine leukemia virus integrase, residues 287-381 (IN CTD); panel i, adenovirus E1B19K; panel j, N-terminal domain of murine leukemia virus integrase, residues 9-105 (IN NTD);

FIG. 6 depicts deuterium incorporation with protonated ILV(δ) methyl groups using the IPTG-induced system. LTQ linear ion trap mass spectrometry data documenting extensive ²H incorporation using the IPTG-induced cSPP system. (a) CspA trypsin fragment SLDEGQKVSFTIESGAK (SEQ ID NO: 1) (z=+2). The unlabeled peak (~10% of the sample) appears at 898.5 mass units. (b) EnvZB trypsin fragment TISGTGLGLAIVQR (SEQ ID NO: 2) (z=+2). The unlabeled peak (~20% of the sample) appears at 693.4 mass units. Similar results were obtained using other peptide fragments. Despite the production of some unlabeled material, the IPTG-induced system provides ~85% of target protein with an average enrichment of 85% with ²H, ¹⁵N, ¹³C, and ¹H-methyls;

FIG. 7 depicts deuterium incorporation with protonated ILV(δ) methyl groups using the dual IPTG/Tet-induced system. LTQ linear ion trap mass spectrometry data documenting extensive ²H incorporation using the dual IPTG/Tet-induced cSPP system, with MazF under control of IPTG and CspA under control of anhydrotetracycline. The dual control system exhibits less production of unlabeled target protein, demonstrated here with data for CspA trypsin fragment SLDEGQKVSFTIESGAK (SEQ ID NO: 1) (z=+3). The unlabeled peak at 599.8 mass units represents <3% of the sample. The mass distribution of labeled species indicates a range of enriched species of 70-100%, with an average enrichment of 85% with ²H, ¹⁵N, ¹³C, and ¹H-methyls. Similar results were obtained using other peptide fragments;

FIG. 8 depicts deuterium incorporation using the dual IPTG/Tet-induced system. LTQ linear ion trap mass spectrometry data documenting extensive ²H incorporation in ²H₂O media without isoleucine, leucine, and valine precursor incorporation, using the dual IPTG/Tet-induced cSPP system. The tryptic fragment of the M-MLV Integrase N-terminal domain (NTD), SHSPYYM(oxidation)LNR (SEQ ID NO: 3) (z=+3), displayed here demonstrates the efficiency of incorporation of deuterium isotope. The unlabeled peak at 428.7 mass units contributes minimally to the overall yield, and the mass distribution observed indicates species with a range of 85-100% deuterium incorporation (predicted mass for complete perdeuteration is 448.3 mass units);

FIG. 9 depicts [¹³C-¹H]-HSQC spectra of ¹H-IL(δ1)V, ²H, ¹³C, ¹⁵N-enriched (a) CspA and (b) EnvZB produced with the IPTG-induced system. Peaks are present in the region of the spectrum where methyl signals appear, but little or no ¹H-¹³C signal is detected for the rest of the protein. This method does not detect the fraction of protein molecules (~10%) with ¹H-¹²C isotopic composition. Inset: Expanded methyl region of (a), with sequence-specific resonance assignments;

FIG. 10 depicts [¹⁵N-¹H^(N)]-HSQC spectrum of CspA. Spectrum collected at pH 6.0 and 20° C. of sample of [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enriched CspA, produced with the IPTG-induced system, and recorded on an 800 MHz NMR spectrometer. Assigned backbone H^(N) resonances are labeled by one-letter amino acid code followed by their sequence numbers;

FIG. 11 depicts a summary of backbone and ¹³C^(β) resonance assignments for CspA derived from triple resonance NMR experiments. Red bars and yellow bars underneath the amino acid sequence represent the connectivity established between intra- and sequential residues, respectively. These data were obtained by analyzing six 2D and 3D NMR spectra, summarized in Table S1. Slowly exchanging backbone amides, used in the conventional structure analysis but not in the CS-Rosetta analysis, identified by ¹H/²H exchange measurements, are represented by filled circles. Secondary structures of the β-barrel found in the final structure are indicated by arrows along the amino acid sequence;

FIG. 12 depicts a stereo-view of the superimposition of the AutoStructure-CNS structure for [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enriched CspA determined by conventional automated analysis methods (blue) with the 2.0 Å X-ray crystal structure of CspA (red) (pdb ID: 1mjc). (a) Backbone line representations of the 10 lowest energy conformers obtained from the AutoStructure-CNS structure compared with the X-ray crystal structure. (b) Ribbon diagram of the lowest energy conformer of the AutoStructure-CNS structure versus the X-ray crystal structure. (c) The packing of the hydrophobic residues (viz, V9, I21, V30, V32, I37, L45, V51, F53, A64, and V67) for the lowest energy conformer of the AutoStructure-CNS structure versus the X-ray crystal structure;

FIG. 13 depicts SDS-PAGE analysis of OmpX preparations. OmpX was expressed from pColdIV(SP4) containing the ACA-less OmpX gene along with MazF from pACYCmazF as described above. Lane M: molecular weight marker. PM+OM: whole membrane fraction including plasma membrane and outer membrane; OM: outer membrane fraction, which was used to collect the NMR spectrum shown in FIG. 3 b. The position of OmpX is indicated by an arrowhead;

FIG. 14 depicts a Coomassie-stained SDS-PAGE analysis of protein expression for CspA (a) and EnvZB (b) in E. coli in minimal medium. Lanes 2-5 contain the whole cell extract equivalent to 100 μL of uncondensed culture. Lane 1, molecular weight marker (M); lane 2, uncondensed expression in H₂O; lane 3, 40-fold condensed in H₂O; lane 4, 40-fold condensed in ²H₂O; lane 5, 40-fold condensed control cells (without pCold I expression vector). Arrows mark the CspA and EnvZB protein bands accordingly. These data demonstrate cost effective production of ²H-enriched proteins in condensed cultures, without acclimation of E. coli cells, using only 2.5% of media costs compared to a conventional fermentation to obtain similar protein yields; and

FIG. 15 depicts NMR spectra of E. coli membrane proteins YaiZ and OmpX. (a) 800 MHz TROSY-[H^(N)-¹⁵N]-HSQC NMR spectrum of uniformly ²H,¹³C,¹⁵N-enriched YaiZ obtained at 40° C. following simple detergent extraction. (b) 600 MHz TROSY-[H^(N)-¹⁵N]-HSQC NMR spectrum at 50° C. of uniformly ²H,¹³C,¹⁵N-enriched OmpX following simple detergent extraction. Target proteins (YaiZ or OmpX) were selectively isotope-enriched and perdeuterated using the cSPP system. The NMR sample of YaiZ was prepared by simple detergent solubilization of the inner membrane fraction by 25 mM MES buffer, pH 6.0, containing 10% LOPG in 95% H₂O/5% ²H₂O. The NMR sample of OmpX was prepared by simple detergent extraction of the outer membrane fraction by 20 mM potassium phosphate buffer, pH 6.4, containing 5% DPC in 95% H₂O/5% ²H₂O.

DETAILED DESCRIPTION

The inventors of the present application have discovered that by optimizing the previously described condensed single protein production (cSPP) method of protein production in E. coli, high levels of specific proteins can be produced in deuterium-based, isotope-enriched minimal medium.

The cSPP method utilizes a bacterial toxin, the endoribonuclease MazF, which specifically cleaves mRNAs at ACA sequences, forcing the cells into a semi-dormant state. By engineering a target gene to be devoid of ACA sequences, and therefore resistant to MazF cleavage, it is possible not only to selectively express and isotope-enrich a targeted protein, but to do so in a dramatically reduced (condensed) culture volume. In this work, it was discovered that the cSPP method requires no acclimation steps for protein expression in deuterated media, and allows as much as 40-fold (or higher) condensation of the culture prior to induction of target protein expression with no significant reduction in protein yield per cell. Using such perdeuterated samples of the E. coli cold shock protein A (CspA), rapid (~3.5 day) NMR data collection, fully automated analysis of resonance assignments, and high quality 3D structure determination without the need for side chain resonance assignments using the CS-Rosetta method were demonstrated. The combined methods of inexpensive perdeuteration and CS-Rosetta provide the basis for a high-throughput approach for routine, rapid, high-quality structure determination of small proteins.
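As a minimal illustration of what "devoid of ACA sequences" means in practice, the following Python sketch lists every ACA triplet in a coding sequence, in any reading frame, on the assumption that cleavage depends on the ACA sequence in the mRNA and not on codon boundaries. The function names and the example sequences are hypothetical and are not part of the described method; removal of the reported sites by synonymous codon substitution is left to the gene designer.

```python
def find_aca_sites(sequence: str) -> list[int]:
    """Return 0-based start positions of every ACA triplet, in any frame."""
    # Normalize case and treat RNA (U) and DNA (T) input alike.
    seq = sequence.upper().replace("U", "T")
    sites = []
    pos = seq.find("ACA")
    while pos != -1:
        sites.append(pos)
        # Advance by one base so overlapping sites (e.g. ACACA) are reported.
        pos = seq.find("ACA", pos + 1)
    return sites


def is_aca_less(sequence: str) -> bool:
    """True when the sequence contains no ACA triplet."""
    return not find_aca_sites(sequence)


# Hypothetical sequences, for illustration only.
print(find_aca_sites("ATGACACAGG"))  # [3, 5]
print(is_aca_less("ATGACCCAGG"))     # True
```

A candidate gene would be considered suitable for the system only when `is_aca_less` returns True for its full coding sequence.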

Also, the cSPP system can be used for cost-effective production of ²H,¹³C,¹⁵N-enriched membrane proteins, including E. coli plasma membrane protein YaiZ and outer membrane protein X (OmpX). Protein perdeuteration approaches have tremendous value in protein NMR studies, but are limited by the high cost of perdeuterated media. However, it has been discovered that the condensed Single Protein Production (cSPP) method can be used to provide high-quality ²H,¹³C,¹⁵N-enriched protein samples at 2.5-10% the cost of traditional methods. As an example of the value of such inexpensively-produced perdeuterated proteins, ²H,¹³C,¹⁵N-enriched E. coli cold shock protein A (CspA) was produced in 40× condensed phase media, and a high-accuracy 3D structure using CS-Rosetta was determined. The cSPP system was also used to produce ²H,¹³C,¹⁵N-enriched E. coli plasma membrane protein YaiZ and outer membrane protein X (OmpX) in condensed phase. NMR spectra can be obtained for these membrane proteins following simple detergent extraction, without extensive purification or reconstitution, allowing a membrane protein's structural and functional properties to be characterized prior to reconstitution, or as a probe of the effects of subsequent purification steps on the structural integrity of membrane proteins. The 10-40 fold reduction in costs of fermentation media provided by using the cSPP system opens the door to many new applications for perdeuterated proteins in spectroscopic and crystallographic studies.
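The relationship between condensation and media cost stated above can be sketched as simple arithmetic. The following Python sketch assumes that protein yield per cell is unchanged by condensation, so that the labeled medium required scales as the reciprocal of the condensation factor. The function names are hypothetical, and the $3,000 per liter figure is only the upper end of the media cost range quoted in the Background.

```python
def media_cost_fraction(condensation_fold: float) -> float:
    """Fraction of the uncondensed media cost needed at a given condensation."""
    if condensation_fold < 1:
        raise ValueError("condensation_fold must be >= 1")
    return 1.0 / condensation_fold


def condensed_media_cost(cost_per_liter: float,
                         uncondensed_volume_l: float,
                         condensation_fold: float) -> float:
    """Labeled-media cost for a culture grown at uncondensed_volume_l
    and then condensed before resuspension in labeled medium."""
    if condensation_fold < 1:
        raise ValueError("condensation_fold must be >= 1")
    return cost_per_liter * uncondensed_volume_l / condensation_fold


for fold in (10, 40):
    print(f"{fold}x: {media_cost_fraction(fold):.1%} of uncondensed media cost")

# Assumed example: 1 L culture, medium at $3,000 per liter, 40x condensation.
print(f"${condensed_media_cost(3000.0, 1.0, 40):.2f}")
```

Under these assumptions, 10× and 40× condensation correspond to the 10% and 2.5% cost figures given in the text.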

One weakness of the available set of cSPP vectors, however, was the accumulation of unlabeled protein due to expression prior to culture condensation. Specifically, although the previously described cSPP system allowed for protein production and isotope enrichment at a fraction of the cost incurred with traditional methods, a substantial amount (10-20%) of the target protein remained unlabeled due to the co-induction of the target protein for 2 hr with the MazF toxin upon addition of IPTG and prior to culture condensation. While in many NMR applications such as ¹H-¹⁵N detected triple resonance experiments, this unenriched population of protein is not visible, it contributes effectively to lowering the yield of isotope-enriched species and reducing the effective signal-to-noise of the NMR experiment. These unenriched species in the NMR sample can also create artifacts and complicate interpretation of X-filtered NMR experiments.

It has now been found that in order to remedy this problem, the induction of the target protein can be temporally regulated by placing the pCold vectors under control of the tet operator. A toolbox of vectors has been developed herein for expression in the cSPP system where temporal regulation of the ACA-less target gene results in greater than 95% isotope enrichment and a reduction in isotopic heterogeneity. This has direct applications to protein NMR for structural genomics and structural biology studies, specifically in applications requiring the production of isotope-enriched proteins.

By temporally separating the induction of the MazF toxin from that of the target protein it is now possible to nearly eliminate contamination from the unlabeled target protein and obtain a protein product that is much more homogeneous with respect to isotopic enrichment. This is critical for certain applications in which labeled species A is mixed with unlabeled species B, and NMR is used to detect interactions between species A and species B as an interaction between a labeled and an unlabeled species; this application fails if, for example, there is a significant component of unlabeled molecules in the sample of labeled species A. In addition to the series of pCold(tet) vectors in the table below, two additional pCold(tet) vectors have been constructed. pColdVb(tet) and pColdVbHis(tet) encode an N-terminal fusion tag comprising the OmpA signal peptide, resulting in peptide secretion. The signal peptide is cleaved by a cellular protease (signal peptidase I) during secretion. pColdVbHis(tet) additionally encodes a His₆ fusion tag that remains after protease cleavage.

Similar to the pCold(tet) vectors in the table below, the pColdVb(tet) and pColdVbHis(tet) vectors drive target gene expression from the cspA promoter and are regulated by the tetO₁ operator.

Table 1 highlights the various features of the pCold(tet) vectors.

Vector           TEE  His Tag  Tag sequence                                   Protease  Tag mass
pColdI(tet)      +    +        MNHKVHHHHHHIEGR/HM (SEQ ID NO: 4)              Xa        2.04 kDa
pColdII(tet)     +    +        MNHKVHHHHHHM (SEQ ID NO: 5)                    none      1.45 kDa
pColdIII(tet)    +    −        MNHKVHM (SEQ ID NO: 6)                         none      0.76 kDa
pColdIV(tet)     −    −        none                                           none      none
pColdX(tet)      −    +        MGHHHHHHSHM (SEQ ID NO: 7)                     none      1.25 kDa
pColdTEV(tet)    +    +        MNHKVHHHHHHSSGRENLYFQ/GHM (SEQ ID NO: 8)       TEV       2.83 kDa
pColdVb(tet)     −    −        MKKTAIAIAVALAGFATVAQA/HM (SEQ ID NO: 9)        Cellular  2.31 kDa
pColdVbHis(tet)  −    +        MKKTAIAIAVALAGFATVAQA/HHHHHHM (SEQ ID NO: 10)  Cellular  3.00 kDa

Using this set of vectors, it is now possible to test several different constructs for optimal protein expression in the cSPP(tet) system. The results presented below show two proteins, one bacterial (CspA) that can be condensed 40×, and one viral (M-MuLV IN C-terminal domain) that can be condensed 10×. Additional experiments have allowed the N-terminal Zn binding domain of M-MuLV IN to be similarly expressed, condensed 15×, and selectively labeled with amino acid precursors. Thus the generality of this vector system is firmly established.

It is important to note that tetracycline addition is toxic to bacteria at concentrations at or higher than 0.5 μg/mL. In order to maximize induction of the target proteins, modified tetracycline derivatives are utilized, which display lower toxicity and thus can be present in higher concentrations (e.g., anhydrotetracycline, 2 μg/mL). In the SPP system described herein, to achieve maximal induction of target gene expression of uncondensed cultures, concentrations of tetracycline approaching the toxicity limit are required. For this reason anhydrotetracycline is utilized at a concentration of 0.2 μg/mL. When the cell culture is condensed to greater densities, the concentration of anhydrotetracycline must also be increased in a linear fashion to compensate for the increased cell number and active transport of anhydrotetracycline across the cell membrane. Generally the degree of condensation is multiplied by 0.15 μg/mL of anhydrotetracycline to give the final concentration of drug required for optimal induction. In this respect a 1× (uncondensed) culture is induced at 0.2 μg/mL of anhydrotetracycline while a 10× culture would be induced at 1.5 μg/mL, and similarly a 40× culture would be induced at a final concentration of 6.0 μg/mL of anhydrotetracycline. It should be noted that in this fashion the known toxicity limit for anhydrotetracycline of 2.0 μg/mL can be exceeded with no toxic effects.
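The dosing rule in the preceding paragraph can be written as a short calculation. The following Python sketch uses the two figures given in the text (0.2 μg/mL for an uncondensed culture and 0.15 μg/mL per fold of condensation). The function name is hypothetical, and treating 0.2 μg/mL as a floor for condensation factors between 1× and about 1.3× is an assumption made here for illustration, since the text does not address those values.

```python
BASE_ATC_UG_PER_ML = 0.2       # 1x (uncondensed) culture, from the text
ATC_PER_FOLD_UG_PER_ML = 0.15  # multiplier per fold of condensation, from the text


def atc_concentration(condensation_fold: float) -> float:
    """Anhydrotetracycline concentration (ug/mL) for a given condensation."""
    if condensation_fold < 1:
        raise ValueError("condensation_fold must be >= 1")
    scaled = ATC_PER_FOLD_UG_PER_ML * condensation_fold
    # Assumption: never dose below the uncondensed (1x) concentration.
    return round(max(BASE_ATC_UG_PER_ML, scaled), 3)


for fold in (1, 5, 10, 20, 30, 40):
    print(f"{fold:>2}x culture: {atc_concentration(fold)} ug/mL aTc")
```

For 1×, 10× and 40× cultures this yields 0.2, 1.5 and 6.0 μg/mL, the values stated in the text.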

Maintaining MazF under IPTG induction and placing the target protein under tetracycline-analog regulation also limits the toxicity associated with tetracycline, because it eliminates the variability that would result from inducing MazF with tetracycline, which creates additional stress on the cells. Thus, in condensed cell cultures, the anhydrotetracycline concentration can exceed the reported toxicity limit. This maximizes the induction levels of the target proteins.

While a tetracycline-based approach to gene expression control has been shown to be successful, the dual induction approach is not limited to tetracycline, and alternative induction methods can similarly be used.

Replacing the tet repressor gene, tetR, in the pCold(tet) vectors with the trp repressor gene, trpR, and likewise replacing the tet operator, tetO₁, with the trp operator sequence would allow for target gene induction upon addition of the inducer, 3-β-indoleacrylic acid (IAA).

Additional examples of dual inducible systems for use in cSPP include regulating MazF expression with L-arabinose induction by introducing the O₁ and O₂ operator and I₁ and I₂ initiation regions upstream of mazF and including the transcription regulator araC gene on the plasmid. This would provide similar induction conditions to that of the pBAD vector series, allowing for target gene induction to be temporally regulated with either of the IPTG or tetracycline inducible versions of the pCold plasmids currently available.

Similar to regulating the mazF gene by L-arabinose induction, the previously described pRHA vector utilizing L-rhamnose for induction would have the added benefit of allowing for “titratable” levels of MazF production. This added feature would allow for various other combinations of target gene induction, including pET vectors, as the amount of MazF produced at higher temperatures could be more tightly regulated. Using an all or none induction system, such as the lac operon, can be problematic when expressing MazF at the higher temperatures required for efficient target protein production in conventional expression systems such as pET, as MazF itself becomes toxic to the cells when expressed at high levels. The ability to titrate MazF production would open the door to various additional combinations of dual inducible scenarios.

In theory, any possible combination of inducible systems for protein production, including regulation by IPTG, tetracycline (e.g., anhydrotetracycline or, alternatively, various other tetracycline analogues, for example epi-anhydrochlorotetracycline), L-arabinose, and L-rhamnose, could be used to create a set of plasmids suitable for use in the dual inducible cSPP system.

An embodiment of the dual induction system has been developed and verified with three independent proteins, one bacterial (CspA) and two viral (MuLV Integrase N-terminal domain, Adenovirus E1B19K). Representative examples of protein expression levels at varying degrees of condensation with the described tetracycline inducible SPP system as compared to that of the established co-inducible SPP system are shown in FIGS. 3 and 5. Also shown in FIG. 4 are comparisons of ¹⁵N-isotope incorporation into the bacterial target protein CspA as measured by mass spectrometry of representative trypsin fragments.

The dual induction system has direct applications to systems requiring the high efficiency incorporation of labels into target protein. Under the previous system, target proteins are co-induced with MazF. One major benefit of the SPP system is that upon MazF induction and host protein shut down, the bacterial cells can be condensed and grown in a small volume. With the single induction system, the target proteins will still be expressed prior to the host-protein shut-down and condensation. The use of the condensed system allows for minimal use of label, tracer elements, and modified amino acids. By separating the induction systems, the cultures can be condensed prior to the addition of the tracer elements and prior to the separate induction of the target proteins. Thus the target protein will only be expressed in the presence of the labels.

The invention of an independently inducible single protein production (SPP) system greatly improves the quantity and quality of isotope labeled product for nuclear magnetic resonance (NMR) applications as well as various other practices requiring specifically labeled protein products as compared to the existing co-inducible SPP system.

A direct example is the use in NMR structural studies. Structural studies of large proteins often require the incorporation of ²H₂O, ¹⁵N and ¹³C labels. Using the single IPTG induction system, isotope-labeled markers were at best incorporated to 80% efficiency due to the three hour period of co-induction with MazF. This efficiency has increased to over 97% using the dual induction system, as judged by mass spectrometry of multiple isolated peptides. Thus, a reproducible high efficiency method of incorporating label specifically into the targeted protein has been developed. Alternative labels can use specific amino acids.

An additional application possible with the advent of the dual inducible cSPP system, not feasible with the co-inducible cSPP system, is the production of asymmetrically labeled samples for NMR spectroscopy. Characterizing oligomer interface interactions can be challenging when determining protein structures by NMR as it is difficult to distinguish inter- from intra-molecular interactions. By producing an asymmetrically labeled protein sample, where one subunit in the complex is labeled differently from that of the other(s), it is possible to distinguish inter- from intra-molecular contacts.

Asymmetrically labeled samples such as those described for protein interface interactions above require that little to no unlabeled protein be present in the final sample. One strategy is to uniformly label the target protein with deuterium while supplying protonated forms of the compounds of interest. Two separate cultures could be prepared where one would produce protein specifically protonated at the amino acids ILV and the other would produce protein specifically protonated at the amino acids FYW. The cultures would then be combined in the purification process only after protein expression was complete in order to generate a mixed multimer. Assuming complete mixing, a ratio of 1:2:1 of mixed multimers would be expected, where 50% of the total protein contains an asymmetric interface. As the co-inducible cSPP system generates a high level of background, which would in this case be protonated, the above described experiment would not be feasible. The dual inducible cSPP system, by reducing the background protein expression prior to isotope enrichment, opens the door to this approach and similar approaches that require minimal to no background of unlabeled protein present.
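The 1:2:1 ratio quoted above is the binomial distribution for a dimer assembled at random from two equally abundant subunit populations. The following Python sketch reproduces that figure. The function names are hypothetical, and the random-assembly model, as well as its generalization to other subunit counts or mixing fractions, is an assumption made here for illustration.

```python
from math import comb


def subunit_mixture(n_subunits: int = 2, fraction_a: float = 0.5) -> list[float]:
    """Probability that an n-mer contains k subunits from culture A, k = 0..n,
    assuming complete (random) mixing of the two subunit populations."""
    return [
        comb(n_subunits, k) * fraction_a**k * (1 - fraction_a) ** (n_subunits - k)
        for k in range(n_subunits + 1)
    ]


def asymmetric_fraction(n_subunits: int = 2, fraction_a: float = 0.5) -> float:
    """Fraction of multimers containing subunits from both cultures."""
    probabilities = subunit_mixture(n_subunits, fraction_a)
    return 1.0 - probabilities[0] - probabilities[-1]


print(subunit_mixture())      # [0.25, 0.5, 0.25], i.e. 1:2:1
print(asymmetric_fraction())  # 0.5
```

For a dimer mixed 1:1, half of the multimers are mixed and half are homogeneous, which matches the 50% figure in the text.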

Combining dual induction with the already established cSPP system offers the benefit of increased yield of isotope-enriched protein and significantly improved signal-to-noise ratios in NMR studies. The dual inducible cSPP(tet) system extends the established cSPP system to various applications such as 2D NMR on perdeuterated proteins and other systems that would not be possible using the co-induced cSPP system. The condensability of the culture using these systems opens the door to many applications in structural and functional genomics in which high protein yields are required in small sample volumes, including microtiter plate fermentation methods. It is expected that the dual inducible expression systems described here, allowing condensation of 10-40 fold in fermentations for isotopic enrichment, will become the default systems used for producing isotope-enriched proteins in E. coli for NMR, mass spectrometry, neutron diffraction, and a wide range of other applications in molecular biophysics and biotechnology.

EXAMPLES

Example 1

Materials and Methods

Vector cloning. The complete sequences of the pCold(tet) vectors including pColdI(tet), pColdII(tet), pColdIII(tet), pColdIV(tet), pColdX(tet), and pColdTEV(tet), have been deposited in Genbank. Beginning with pColdI(SP-4) as the parental vector, pColdI(tet) was generated in the following manner: The lacI gene and upstream region, up to but not including the cspA promoter, was replaced with the tetR gene and a modified version of the promoter-operator region of tetA and tetR from Tn10. The tetA and tetR promoter-operator region was cloned upstream of the existing cspA cold shock promoter in order to avoid disrupting transcription of the target gene by the cold shock promoter. In addition, the DNA sequence of the lac operator region within the 5′UTR of cspA controlling induction of the target gene was mutated to that of the tet operator tetO₁. The Tn10 promoter-operator region consists of two tet repressor binding sites, tetO₁ and tetO₂, with overlapping promoters driving transcription in opposite directions. Only transcription of the tetR gene was required; therefore, the −10 and −35 elements of the tetA promoter were mutated to avoid potential interference with the adjacent cspA promoter driving target gene expression. Generation of pCold II, III, and IV(tet) vectors proceeded by cloning the region upstream of the multiple cloning site between NheI and NdeI, containing the various fusion tag sequences from the pCold series, into the pColdI(tet) backbone. pColdX(tet) and pColdTEV(tet) inserts were generated by synthesizing the appropriate fusion tag sequences via overlapping PCR and cloning the resulting DNA product into the pColdI(tet) vector between NheI and NdeI.

An ACA-less cspA gene, as previously described in Suzuki, M. et al. (2005) “Single protein production in living cells facilitated by an mRNA interferase.” Mol Cell 18, 253-261, incorporated herein by reference in its entirety, was sub-cloned into the pColdI(tet) vector between the NdeI and BamHI restriction sites for protein expression. A truncated C-terminal ACA-less gene construct from Thr287-Gly381 of the Moloney Murine Leukemia Virus (M-MuLV) Integrase (IN) protein was codon optimized for E. coli and synthesized by overlapping PCR. The resulting DNA product was then sub-cloned into the NdeI and BamHI restriction sites of the pCold(tet) series of vectors for protein expression.

Protein expression. pCold(tet) vectors containing the desired insertwere transformed into chemically competent BL21(DE3) cells containingthe vector pACYCmazF, as previously described in Suzuki, M. et al.(2007) “Single protein production (SPP) system in Escherichia coli.” NatProtoc 2, 1802-1810, incorporated herein by reference in its entirety.Single colonies were picked and inoculated into 2 ml of M9CAA containing30 μg/ml chloramphenicol for selection of pACYCmazF and 100 μg/mlcarbenicillin for selection of pCold(tet), and grown overnight (20 hr).The following day, M9 medium containing 30 μg/ml chloramphenicol and 100μg/ml carbenicillin was inoculated at 5 μl/ml and grown overnight (20hr). Fresh M9 medium containing 30 μg/ml chloramphenicol and 100 μg/mlcarbenicillin was then inoculated at 10% with the overnight culture andgrown at 37° C. until the OD₆₀₀ 0.5 and then the cultures were rapidlycooled while swirling in an ice bath for 10-15 mM. The MazF protein wasinduced with 1 mM IPTG and cultures were incubated at 15° C. for anadditional 2 hr while shaking. Cell cultures are then centrifuged for 10min at 4° C. at 5000×g and resuspended in the desired condensed culturevolume of M9 medium or ¹⁵N-enriched M9 medium (in which ¹⁵NH₄Cl is thesole source of nitrogen) containing 30 μg/ml chloramphenicol, 100 μg/mlcarbenicillin, 1 mM IPTG, and anhydrotetracycline. The level ofanhydrotetracycline varied with the level of condensation; 0.2 μg/ml for1× cultures, 0.75 μg/ml for 5× cultures, 1.5 μg/ml for 10× cultures, 3.0μg/ml for 20× cultures, 4.5 μg/ml for 30× cultures, and 6.0 μg/ml for40× cultures. To directly compare protein expression between differentvectors at various degrees of culture condensation, each lane was loadedwith the equivalent of 100 μl of uncondensed culture centrifuged andresuspended in lysis buffer (10 mM sodium phosphate pH 7.2; 1%β-mercaptoethanol; 1% SDS; 6 M urea).Protein purification and mass spectrometry. 
1 ml of 40-fold condensed, ¹⁵N-enriched cultures containing either the expression plasmid pColdI(tet) cspA or pColdI(IPTG) cspA was centrifuged at 10,000×g and resuspended in 1.0 ml of lysis buffer (10 mM sodium phosphate pH 7.2; 1% β-mercaptoethanol; 1% SDS; 6 M urea). Cellular debris was pelleted by centrifugation at 12,000×g for 30 min, and the His₆ tagged CspA protein in the supernatant was purified by binding the cellular lysate to 40 μl of Ni-NTA agarose resin. The Ni-NTA resin was washed twice with 1 ml of wash buffer (50 mM Na₂HPO₄-NaH₂PO₄; 300 mM NaCl; 50 mM imidazole; 5 mM β-mercaptoethanol, pH 8.0) and eluted in 100 μl of elution buffer (50 mM Na₂HPO₄-NaH₂PO₄; 300 mM NaCl; 250 mM imidazole; 5 mM β-mercaptoethanol, pH 8.0). Purified protein samples were run on a 15% SDS-PAGE gel followed by Coomassie Blue staining, and the CspA protein band was excised. In-gel digest with trypsin was performed in 50 mM NH₄HCO₃, pH 7.9 overnight. Peptides were extracted from the gel with 60% acetonitrile, 5% formic acid, and lyophilized. Samples were then solubilized in 0.1% trifluoroacetic acid (TFA), pH 2.5-3.0, prior to LC-MS/MS mass spectrometry. Chromatography was conducted using an Ultimate nano-LC system (Dionex/LC Packings) and a fritless nanoscale column (75 μm×15 cm) packed in-house with 3 μm, 200 Å pore size Magic C18 stationary phase (Michrom Bioresource, Auburn, Calif.). The column was equilibrated in 0.1% formic acid (Solvent A), and samples were eluted using a linear gradient from 2% to 45% solvent B (0.1% formic acid in acetonitrile) over 30 min at a flow rate of 250 nl/min and analyzed by an LTQ linear ion trap mass spectrometer (ThermoFinnigan, San Jose, Calif.) equipped with a nanospray source (Proxeon Biosystems).

To select peptides suitable for monitoring, tryptic peptides were analyzed by MS/Zoom scan/MSMS scan; the peptide GFGFITPDDGSK (SEQ ID NO: 11) was selected for further analysis. For quantitation of isotope incorporation, the analysis was repeated with MS in profile mode. The peaks corresponding to the isotope labeled peptides (identified by retention time and MSMS information) were integrated for peak area.
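The anhydrotetracycline levels and gel loading given under Protein expression above follow a simple pattern: each condensed culture listed received 0.15 μg/ml per fold of condensation, the uncondensed culture received 0.2 μg/ml, and each lane carried the equivalent of 100 μl of uncondensed culture. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and folds other than those listed (1, 5, 10, 20, 30 and 40) are an extrapolation rather than tested conditions.

```python
def atc_ug_per_ml(fold: float) -> float:
    """Anhydrotetracycline concentration (ug/ml) for a fold condensation."""
    if fold < 1:
        raise ValueError("fold condensation must be >= 1")
    if fold == 1:
        return 0.2       # uncondensed cultures, as listed in the text
    return 0.15 * fold   # reproduces the listed 5x to 40x values


def lane_load_ul(fold: float, uncondensed_equiv_ul: float = 100.0) -> float:
    """Volume of condensed culture equivalent to a volume of 1x culture."""
    if fold < 1:
        raise ValueError("fold condensation must be >= 1")
    return uncondensed_equiv_ul / fold


if __name__ == "__main__":
    for fold in (1, 5, 10, 20, 30, 40):
        print(fold, round(atc_ug_per_ml(fold), 2), lane_load_ul(fold))
```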

Results

Tetracycline inducible pCold vectors. A series of pCold vectors was generated, separating the IPTG induction of MazF from that of the ACA-less target gene. See FIG. 1, wherein vector features are highlighted. (A), pColdI(tet) vector map, where the region shown in bold containing TEE (translation enhancing element), His Tag, and Factor Xa varies among the different pCold(tet) vectors. M13 IG, intergenic region of M13 bacteriophage; ColE1, colicinogenic factor E₁ for plasmid replication. (B), expanded tetO₁ and tetO₂ region depicting the tetA-tetR overlapping promoter-operator. The mutated bases within the −35 and −10 elements of the tetA promoter are highlighted in red italicized font. (C), expanded multiple cloning site including suggested sequencing primers and fusion tag; S.D., Shine-Dalgarno sequence. (* these restriction sites are duplicated in the multiple cloning region but can be used for 3′ end cloning. Xba I is also present in the tetR gene and therefore is not recommended for cloning.)

FIG. 1A graphically describes the various features of pColdI(tet). In the new pCold(tet) vectors, the lacI gene that is responsible for repressing transcription from the lac operator was replaced with the tetR gene. Expression of the tetR gene is controlled by the promoter-operator region of Tn10 that is shared with and overlaps with that of the oppositely oriented tetA gene (FIG. 1B). Modifications were therefore made to the tetR/tetA promoter-operator region in an effort to prevent RNA polymerase binding and transcription from the tetA promoter. While leaving the −10 and −35 elements of the tetR gene intact, the −35 and −10 elements of the tetA gene, 5′-TTGACA-3′ and 5′-TATTTT-3′ respectively, were mutated away from consensus to 5′-GCTCTA-3′ and 5′-CGCGTT-3′ respectively without disrupting the neighboring tetO₁ and tetO₂ operator sequences (FIG. 1B). In order to place the cspA promoter-driven target gene under the control of the tet operator, the lac operator sequence within the cspA 5′ UTR was replaced with the tet operator sequence to allow for tet repressor binding to tetO₁ (FIG. 1A). The resulting changes successfully allowed for the independent induction of MazF (lacI^(R)) from that of the target protein (tetR^(R)). FIG. 1C describes the multiple cloning site. Cloning of an ACA-less synthesized gene is readily obtained through the 5′ Nde I site followed by insertion of an in-frame stop codon. Several enzyme sites within the multiple cloning site are duplicated. The 3′ terminus of the target gene can utilize the BamH I site or alternative restriction enzyme sites located in the multiple cloning site.

pCold(tet) vector series. A series of pCold(tet) vectors has been generated with various features for protein expression. As described in Table 1, each vector provides a unique combination of fusion tags for protein expression and purification purposes. The translational enhancing element (TEE) encoding the amino acids MNHKV (SEQ ID NO: 12), featured in all but one of these vectors [pColdIV(tet)], has previously been shown to increase translational efficiency in cold shock mRNAs. His₆ tags are included for purification purposes, where pColdI(tet) encodes a Factor Xa protease cleavage site and pColdTEV(tet) encodes a TEV protease cleavage site. While proteolytic removal of the His₆ tag by Factor Xa from protein expressed in pColdI(tet) leaves an N-terminal His-Met as opposed to the three residues (Gly-His-Met) remaining from TEV cleavage of proteins expressed from pColdTEV(tet), the TEV protease efficiently cleaves proteins in a wide range of buffer conditions. In contrast, Factor Xa cleavage requires a specific buffer containing calcium that may decrease the solubility of the target protein. It should be noted that any desired fusion tag can be added in frame to the target gene during gene synthesis and cloned into the pColdIV(tet) vector. Together, the vectors described in Table 1 provide a range of options for protein expression and purification purposes in the cSPP system. As described in Table 1, the various fusion tags encoded in all but the pColdIV(tet) vector will alter the predicted size of the protein product by between 0.76 kDa and 2.83 kDa depending on the vector.

pCold(tet) vector expression. To compare protein expression from the full series of pCold(tet) vectors, an ACA-less gene construct encoding a truncated version of the C-terminal domain of the M-MuLV IN protein was cloned into the multiple cloning site of all six pCold(tet) vectors. FIG. 2 confirms that the ACA-less gene product is produced from all six vectors with slight variation in the level of expression. The gel shown in FIG. 2 is representative of several independent inductions, where size differences arise from the various fusion tags described in Table 1. Slight variations in culture density, visible in FIG. 2 as differences in background staining, account for the variability in target protein band intensity between samples.

Comparison of expression and culture condensation in pCold(IPTG) and pCold(tet) vectors. To determine whether proteins expressed from pCold(tet) vectors not only maintained the ability to undergo culture condensation similar to the original IPTG inducible pCold(IPTG) vector, but were also improved regarding homogeneity of isotope enrichment (¹⁵N), a side by side comparison of expression and culture condensation was carried out. For comparison, the bacterial cold shock cspA gene product was expressed as an ACA-less cassette in the pColdI vector backbone under either IPTG (pColdI(IPTG)) or anhydrotetracycline (pColdI(tet)) induction. Upon reaching the correct cell density, the IPTG inducible MazF toxin was expressed for 2 hr prior to culture condensation to allow for MazF mediated degradation of cellular mRNAs and cell growth arrest. After 2 hr of MazF induction, cells were centrifuged and resuspended in various volumes of ¹⁵N-enriched M9 medium. FIGS. 3 (A and B) compare the various condensed states of protein expression resulting from the IPTG inducible (A) and tet inducible (B) pCold vectors. In both cases, little difference in expression is observed from uncondensed (1×) to 40-fold (40×) condensation.
It can be concluded that expression from the pColdI(tet) vector is comparable to expression from the pColdI(IPTG) vector.

Comparing isotope incorporation in pCold(IPTG) and pCold(tet) vectors. FIGS. 4 (A and B) show a graphical representation of the time course of induction and isotope incorporation. For the IPTG induced system, the addition of IPTG required to induce MazF expression prior to culture condensation and introduction of isotope enriched medium leads to a substantial amount of unwanted expression from the pCold(IPTG) vectors. The grey shaded region highlights the period where IPTG induced protein expression occurs in the absence of culture condensation and introduction of isotope enriched medium. The dual IPTG/tet induced system is predicted to eliminate the substantial amount of unlabeled expression from the ACA-less target gene. To confirm the improved yield of isotope enriched protein product expected with the dual induction cSPP(tet) system, mass spectrometry was performed on 40× condensed, ¹⁵N-enriched CspA protein. FIGS. 4 (C and D), displaying a representative tryptic fragment GFGFITPDDGSK (SEQ ID NO: 11) of CspA, highlight the dramatically improved ratio of isotope enriched to unenriched product when using the pColdI(tet) vector as compared to the pColdI(IPTG) vector. The predicted weight of the fully labeled, doubly charged peptide is 626.70 Daltons and the unlabeled peak is 621.00 Daltons. Remarkably, the 621.00 mass unit peak constitutes 20% of the total protein fraction when using the pColdI(IPTG) vector but is nearly absent, representing only 1.3% of the total protein, when using the pColdI(tet) vector. Comparison of the two plots shows that the labeling of the tetracycline induced protein (FIG. 4D) is dramatically less heterogeneous and more complete than that of protein expressed from the IPTG inducible pColdI vector (FIG. 4C).
The fully labeled peak from the pColdI(tet) product appears at 626.81 mass units, the predicted weight for 100% isotope incorporation, compared to the 625.81 mass unit peak from product produced with the pColdI(IPTG) vector. Furthermore, the isotope enrichment is more complete, as determined by comparing the width of the peak at half height. The distribution from product produced with the pColdI(IPTG) vector spans 2.5 mass units (representing a range of isotopic incorporation of 60-100%), whereas the peak from the pColdI(tet) vector product spans only 1.5 mass units (representing a range of isotopic incorporation of 90-100%).
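The unlabeled fractions quoted above (20% versus 1.3%) are ratios of integrated peak areas. As a hedged illustration in Python, the helper below computes such a ratio from a list of (m/z, area) pairs. The function name, the ±0.5 m/z window, and the example areas are hypothetical and are not data from FIG. 4.

```python
def unlabeled_fraction(peaks, unlabeled_mz, window=0.5):
    """Fraction of total peak area lying within +/- window of the unlabeled m/z.

    peaks: iterable of (mz, area) pairs taken from a profile-mode scan.
    """
    peaks = list(peaks)
    total = sum(area for _, area in peaks)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    unlabeled = sum(area for mz, area in peaks
                    if abs(mz - unlabeled_mz) <= window)
    return unlabeled / total


if __name__ == "__main__":
    # Made-up areas chosen only to exercise the function.
    example = [(621.0, 20.0), (625.8, 50.0), (626.3, 30.0)]
    print(f"{unlabeled_fraction(example, 621.0):.1%}")
```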

Example 2 Methods and Materials

The cSPP system utilizes MazF, a cellular endoribonuclease, to induce a quasi-dormant state within the cell. MazF specifically cleaves single-stranded RNA at ACA sequences, thereby destroying cellular mRNAs and shutting down protein synthesis. Engineering a target gene devoid of ACA triplets therefore renders transcripts resistant to MazF cleavage. Upon expression of MazF and induction of a quasi-dormant state, it is possible to not only selectively produce a single protein in a living cell, but to do so in highly-condensed culture conditions.

General Protocol for Production of [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enriched Proteins Using Anhydrotetracycline-Inducible SPP Vectors. Anhydrotetracycline-inducible pCold vectors, containing target protein under the control of the tet operator (tetO₁), are described in reference 1. The following protocol is used for production of ²H-enriched proteins with these vectors.

(i) Pick a single colony (containing pMazF and the expression plasmid), using a toothpick, from freshly plated transformed cells grown overnight at 37° C. on MJ9-CAA medium based plates (MJ9 medium supplemented with 0.2% casamino acids) containing 25 μg/mL chloramphenicol and 100 μg/mL ampicillin, and inoculate MJ9-CAA medium for overnight growth on a shaker (at approximately 180 r.p.m.). Throughout this protocol, the MJ9 medium, which is vitamin and buffer supplemented², can be replaced with standard M9 medium, though better protein expression yields are usually obtained with MJ9 medium.
(ii) Inoculate MJ9 medium containing 25 μg/mL chloramphenicol and 100 μg/mL ampicillin at 5 μL/mL (CAA should NOT be added). Grow cultures overnight at 37° C. on a shaker (at approximately 180 r.p.m.).
(iii) Inoculate 10% of desired culture volume from overnight culture into fresh MJ9 medium containing 25 μg/mL chloramphenicol and 100 μg/mL ampicillin, and incubate at 37° C. on a shaker (at approximately 180 r.p.m.). Note: if protein is to be labeled with specific amino acids or precursors, the same unlabeled amino acids or precursors (see b. below) should be added at this time to suppress the corresponding biosynthesis pathways from glucose during cell culture growth.
    Specifically, for preparations labeling the isopropyl methyl groups of Leu and Val, add 100 mg/L α-ketoisovaleric acid; for preparations labeling Ile, add 50 mg/L α-ketobutyric acid; for preparations labeling Phe, Tyr, or Trp, add 75 mg/L shikimic acid or 50 mg/L of the individual amino acids Phe, Tyr, and/or Trp, to suppress the corresponding biosynthetic pathways.
(iv) Monitor the optical density of the culture at 600 nm.
The initial OD₆₀₀ should be approximately 0.2 after inoculation. Make sure the culture is growing exponentially: a plot of log OD₆₀₀ versus time should be linear.
(v) When the OD₆₀₀ reaches ˜0.5, remove the flask from the shaker and chill the culture rapidly by shaking the flask in an ice water bath for 10 min to reach a target temperature of 15° C.
(vi) Add IPTG to a final concentration of 1 mM to induce expression of MazF. Harvest a sample of cells from 1.5 mL of the culture by centrifugation (12,000×g, 5 min, 4° C.) and store the resulting pellet at −20° C. for subsequent analysis of protein expression levels by SDS-PAGE.
(vii) Continue the culture at 15° C. with shaking for 2 more hours. This 2 hour pre-incubation step before culture condensation and isotope labeling is important to prevent background isotope incorporation and prepare the cells for subsequent condensation.
(viii) Centrifuge (5000×g, 10 min, 4° C.) the culture to collect the cells and resuspend the cell pellet at 2.5% of the initial culture volume (40×) in chilled 100 mM phosphate buffer, pH 7.5, in ²H₂O to wash the cells. Repeat the centrifugation step to re-pellet the cells.
(ix) Resuspend the cell pellet in the desired culture volume of MJ9 medium prepared in ²H₂O and containing specifically labeled amino acids or precursors together with 25 μg/mL chloramphenicol, 100 μg/mL ampicillin, 1 mM IPTG and anhydrotetracycline (see b. below for anhydrotetracycline concentration). For production of [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enriched proteins, the condensed fermentation contains 1 g/L ¹⁵N-NH₄Cl; 4 g/L ¹³C,²H-glucose; 50 mg/L α-¹³C-ketobutyric acid; 100 mg/L α-¹³C-ketoisovaleric acid.
For labeling of Phe, Tyr, and/or Trp [¹H,¹³C]-enriched sites, add 75 mg/L of ¹H,¹³C-enriched shikimic acid or 50 mg/L each of isotopically-enriched Phe, Tyr, and/or Trp amino acids.
    a. The degree of condensation allowing for optimal expression varies among the different proteins expressed (FIG. 5). 20-fold condensation (5% of initial volume) is a good starting point; however, the optimal fold condensation can be worked out in small scale pilot experiments if desired. Dilute a portion of the condensed culture to give approximately 5 mL at 1× and induce separately as an uncondensed control.
    b. Uncondensed cultures should be induced with 0.2 μg/mL of anhydrotetracycline. Culture condensation requires that the concentration of anhydrotetracycline is increased linearly up to 20× condensation. To calculate this, multiply the fold condensation of the culture by 0.15 μg/mL of anhydrotetracycline. For example, a 20× condensed culture would require 20 × 0.15 μg/mL = 3.0 μg/mL of anhydrotetracycline for induction.
(x) Incubate the condensed culture at 15° C. with shaking overnight. Harvest the cell pellet by centrifuging (12,000×g, 5 min, 4° C.) and store it at −80° C.

General Protocol for Production of [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enriched Proteins Using IPTG-Inducible SPP Vectors. Using IPTG inducible pCold vectors, available from Takara Bioscience, the protocol for production of ²H-enriched proteins is essentially identical to that used for the anhydrotetracycline-induced vectors, except that no anhydrotetracycline is added in step (ix).

Preparation of [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-CspA for Structural Studies.
Isotope-enriched samples of CspA were prepared as described in the protocol above, with the following details. Competent E. coli BL21(DE3) cells containing the pACYCmazF plasmid were transformed with pColdI(SP-4) plasmid (Takara Bioscience, Inc) containing ACA-less cspA cloned into the Nde I-BamH I sites. The resulting constructs include a 16-residue N-terminal tag, consisting of a translation enhancing element (TEE), a His₆ tag, and a Factor Xa cleavage site. Protein expression was performed essentially as described above, with the following details: single colonies were selected and used to inoculate 2.5 mL LB medium at 37° C. for 6 hrs. 2 mL of the LB culture was inoculated into 100 mL of MJ9 minimal medium at 37° C. overnight. When OD₆₀₀ reached 1.8-2.0 units, the culture was centrifuged at 3000×g for 15 min at 4° C. The cell pellet was resuspended in 1 L of fresh MJ9 medium and cells were grown at 37° C. until OD₆₀₀ reached 0.5. At this point the culture was chilled on ice for 5 min and shifted to 15° C. for 45 min to acclimate the cells to cold shock conditions. Target protein (CspA) was then expressed along with MazF for 1.5 hrs by addition of 1 mM isopropyl-β-D-thiogalactoside (IPTG) prior to expression in isotope enriched medium. Cultures were then centrifuged at 3000×g for 15 min at 4° C., resuspended in 2.5% volume (40×) in deuterated (²H₂O) wash solution (7.0 g/L Na₂HPO₄; 3.0 g/L KH₂PO₄; 0.5 g/L NaCl; pH 7.4), centrifuged again, and resuspended in 25 mL of deuterated MJ9 minimal medium containing 1 g/L ¹⁵NH₄Cl; 4 g/L ¹³C,²H-glucose; 50 mg/L α-¹³C-ketobutyric acid; 100 mg/L α-¹³C-ketoisovaleric acid; and 1 mM IPTG. Protein expression continued at 15° C. for 24 hrs.
Cells were harvested by centrifugation as described above and cell pellets were stored at −80° C. For SDS-PAGE analysis, 100 μL of uncondensed cell culture (1×) or 2.5 μL of 40-fold condensed cultures (40×) were analyzed by Coomassie Blue stain. All isotopes were purchased from Cambridge Isotope Laboratories.

CspA Purification and Concentration. Cell pellets were resuspended in 40 mL of lysis buffer [50 mM Na₂HPO₄-NaH₂PO₄; 300 mM NaCl; 5 mM imidazole; 5 mM 2-mercaptoethanol (βME); with 1 EDTA-free protease inhibitor tablet (Roche Cat. #11 873 580 001) per 50 mL at pH 8.0] and sonicated to lyse the cells. Lysed cells were then centrifuged at 4° C. for 1 hr at 16,000 rpm in a Sorvall SS-34 rotor. Proteins were further purified by binding to Ni-NTA agarose at 40 mL of soluble extract per 1 mL of bed resin at 4° C. overnight. Resin was washed twice with 25 mL of wash buffer [50 mM Na₂HPO₄-NaH₂PO₄; 300 mM NaCl; 25 mM imidazole; 5 mM 2-mercaptoethanol (βME), pH 8.0], and protein was eluted in 8 mL of elution buffer [50 mM Na₂HPO₄-NaH₂PO₄; 300 mM NaCl; 300 mM imidazole; 5 mM 2-mercaptoethanol (βME), pH 8.0]. The protein solution was then dialyzed overnight at 4° C. into NMR buffer (20 mM Na₂HPO₄-NaH₂PO₄, 50 mM KCl, 0.02% NaN₃, 5 mM MgCl₂, pH 7.0 for EnvZB, and 50 mM KH₂PO₄, 1 mM NaN₃, adjusted to pH 6.0 with 5 M KOH for CspA). Protein solutions were then concentrated with Amicon Ultra 4 centrifugal filter devices (5000 MWCO) by centrifugation (2500×g) to a volume of 1 mL. 3 mL of NMR buffer containing 5% ²H₂O was added and samples were again centrifuged to 1 mL. This process was repeated and samples were further concentrated to a volume of 500 μL at a final concentration of 2-3 mg/mL (˜0.2 mM) for NMR studies.

Expression of E.
coli Plasma Membrane Protein YaiZ in cSPP System. Uniformly ²H,¹³C,¹⁵N-enriched YaiZ was prepared according to the protocol described above, with the following details. E. coli BL21(DE3) cells, transformed with pACYCmazF and pColdI(SP-4) plasmids harboring the ACA-less yaiZ target gene, were grown in M9-glucose medium at 37° C. When OD₆₀₀ reached 0.5-0.6 units, the culture was chilled on ice for 5 min and shifted to 15° C. for 45 min in order to acclimate the cells to cold temperature. After the cold-shock treatment, the expression of both MazF and the target gene was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). The cell pellet was washed twice with 10 mL of M9 salt buffer (no NH₄Cl) in ²H₂O, and finally suspended in 50 mL of deuterated M9 minimal medium (20× condensed phase) containing 1 g/L ¹⁵NH₄Cl; 4 g/L ¹³C,²H-glucose, and 1 mM IPTG. Protein production was induced at 15° C. for ˜36 hr and the cells were then harvested by centrifugation.

Expression of E. coli Outer Membrane Protein OmpX in cSPP System. Uniformly ²H,¹³C,¹⁵N-enriched OmpX was prepared according to the protocol described above, with the following details. An ACA-less gene coding for E. coli outer membrane protein OmpX was cloned into pColdIV(SP-4). E. coli strain BL21(DE3) was engineered to be ompA⁻ and ompF⁻, removing two major outer membrane proteins, a strategy aimed at enhancing target outer membrane protein expression. These E. coli BL21(DE3) cells were transformed with pACYCmazF and pColdI(SP-4) plasmids harboring the target gene and grown in 2 L M9-glucose medium at 30° C. When OD₆₀₀ reached 0.5-0.6 units, the culture was chilled on ice for 5 min and shifted to 15° C. for 45 min in order to acclimate the cells to cold temperature.
After the cold-shock treatment, the expression of both MazF and the target gene was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 45 min. The cells were then harvested by centrifugation at 3000× rpm for 30 min at 4° C. The cell pellet was washed twice with 10 mL of M9 salt buffer (no NH₄Cl) in ²H₂O, and finally suspended in 50 mL of deuterated M9 minimal medium (40× condensed phase) containing 1 g/L ¹⁵NH₄Cl; 4 g/L ¹³C,²H-glucose, and 1 mM IPTG. Protein production was induced at 15° C. for ˜36 hr and the cells were then harvested by centrifugation.

NMR Sample Preparation of YaiZ. The cell pellet provided by a 20× condensed culture originating as a 1 L fermentation was suspended in 10 mL of 50 mM Tris buffer (pH 7.4). Cells were then lysed by a French press at 15,000 psi. The membrane fraction was collected by centrifugation at 100,000×g for 1 hr at 4° C. The membrane pellet was resuspended in 1 mL of 50 mM Tris buffer (pH 7.4) by sonication and centrifuged at 100,000×g for 1 hr at 4° C., and the membrane pellets were stored at −80° C. The NMR sample of [²H,¹³C,¹⁵N]-YaiZ was prepared by simple detergent solubilization of the plasma membrane fraction in 25 mM MES buffer, pH 6.0, containing 10% LOPG in 95% H₂O/5% ²H₂O. The final concentration of YaiZ in the NMR sample was ˜0.2 mM.

NMR Sample Preparation of OmpX. The cell pellet provided by a 40× condensed culture originating as a 1 L fermentation was suspended in 4 mL of 20 mM potassium phosphate buffer, pH 6.4, containing 0.5% sodium lauryl sarcosinate (Sarkosyl) and incubated at room temperature for 20 min, followed by centrifugation at 135,000×g for 30 min at 4° C. The outer membrane fraction was then isolated as a pellet¹⁰.
The NMR sample of [²H,¹³C,¹⁵N]-OmpX was then prepared by simple detergent extraction from the outer membrane fraction using 20 mM potassium phosphate buffer, pH 6.4, containing 5% DPC and 5% ²H₂O. The sample was briefly treated at about 80° C. to allow extensive back-exchange of amide protons. The final concentration of OmpX in the NMR sample was ˜0.2 mM.

Mass Spectrometry for Analysis of Isotope Incorporation. Purified protein samples were run on a 15% SDS-PAGE gel followed by Coomassie Blue staining. In-gel digest was performed in 50 mM NH₄HCO₃, pH 7.9 overnight. Peptides were extracted from the gel with 60% acetonitrile, 5% formic acid, and lyophilized. Samples were then solubilized in 0.1% trifluoroacetic acid (TFA), pH 2.5-3.0, prior to liquid chromatography mass spectrometry/mass spectrometry (LC-MS/MS) analysis. Chromatography was done using an Ultimate nano-LC system (Dionex/LC Packings) and a fritless nanoscale column (75 μm×15 cm) packed in-house with 3 μm, 200 Å pore size Magic C18 stationary phase (Michrom Bioresource, Auburn, Calif.). The column was equilibrated in 0.1% formic acid (Solvent A), and samples were eluted using a linear gradient from 2% to 45% solvent B (0.1% formic acid in acetonitrile) over 30 min at a flow rate of 250 nl/min and analyzed by an LTQ linear ion trap mass spectrometer (ThermoFinnigan, San Jose, Calif.) equipped with a nanospray source (Proxeon Biosystems).
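The linear gradient described above can be written as a function of run time. The Python sketch below is illustrative only: the function names are hypothetical, the composition is clamped outside the 30 min ramp as a simplifying assumption, and the flow rate is taken as 250 nl/min, the value given for the same nano-LC setup in Example 1.

```python
def percent_b(t_min, start_b=2.0, end_b=45.0, duration_min=30.0):
    """Solvent B percentage at time t_min for a linear gradient.

    Values are clamped to start_b before t = 0 and to end_b after the ramp.
    """
    if duration_min <= 0:
        raise ValueError("gradient duration must be positive")
    frac = min(max(t_min / duration_min, 0.0), 1.0)
    return start_b + (end_b - start_b) * frac


def delivered_volume_ul(duration_min=30.0, flow_nl_per_min=250.0):
    """Total solvent volume (microlitres) delivered over the gradient."""
    return duration_min * flow_nl_per_min / 1000.0


if __name__ == "__main__":
    for t in (0, 10, 20, 30):
        print(t, round(percent_b(t), 1))
    print(delivered_volume_ul())
```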

To select peptides suitable for monitoring isotopic enrichment, the tryptic peptides were analyzed by MS/Zoom scan/MSMS scan. For EnvZB, peptide TISGTGLGLAIVQR (SEQ ID NO: 2) was selected; for CspA, peptide SLDEGQKVSFTIESGAK (SEQ ID NO: 1) was selected; and for MuLV IN NTD, peptide SHSPYYM_((oxidation))LNR (SEQ ID NO: 3) was selected. Quantitation of isotope incorporation was performed with MS in profile mode, by integration of peak areas. The percent of isotope incorporation was determined by comparing the mass estimated from the isotope-enriched peptide mass distribution to the expected mass computed assuming 100% isotope (and/or methyl ¹H) incorporation. The EnvZB and CspA proteins were expressed in ¹³C, ¹⁵N, ²H enriched medium with ¹³C-¹H labeled precursors to the methyl groups of Ile(δ1), Leu, and Val. It is therefore expected that all of the carbon and nitrogen atoms present in a fully enriched sample are ¹³C and ¹⁵N isotopic forms, the targeted methyls are ¹H forms, and all of the rapidly-exchanging atoms will back-exchange to ¹H when purified in H₂O buffers. Therefore, all NH, SH, and OH groups, along with the methyl protons of Ile(δ1), Leu, and Val residues, are assumed to be ¹H isotopes in the fully isotope-enriched peptide fragments. To specifically determine the percent of deuterium incorporation and precursor incorporation independently, the MuLV IN NTD was expressed in ²H₂O medium without protonated precursors for the amino acids isoleucine, leucine, and valine. The peptide SHSPYYM_((oxidation))LNR (SEQ ID NO: 3) was then analyzed to determine the efficiency of ²H incorporation.
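One simple way to express the comparison between the observed and the expected fully enriched mass is linear interpolation between the unlabeled and fully labeled values. The Python sketch below assumes that interpolation, which the text does not spell out, and assumes all three values refer to the same charge state so that m/z values can be compared directly. The function name and the test numbers are hypothetical.

```python
def percent_incorporation(observed_mz, unlabeled_mz, full_label_mz):
    """Percent isotope incorporation by linear interpolation.

    0% corresponds to the unlabeled peptide and 100% to the mass expected
    for complete incorporation; all values must share one charge state.
    """
    span = full_label_mz - unlabeled_mz
    if span <= 0:
        raise ValueError("fully labeled m/z must exceed unlabeled m/z")
    return 100.0 * (observed_mz - unlabeled_mz) / span


if __name__ == "__main__":
    # Round numbers chosen for illustration, not measured values.
    print(percent_incorporation(105.0, 100.0, 110.0))
```

The result is not clamped, so an observed value slightly above the expected mass reports more than 100%, which may be useful for spotting calibration offsets.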

NMR Spectroscopy. NMR measurements of uniformly ²H,¹³C,¹⁵N-enriched YaiZ in 10% LOPG were obtained using an 800 MHz Bruker AVANCE spectrometer with a cryoprobe at 40° C. TROSY [H^(N)-¹⁵N] HSQC NMR data were acquired using spectral widths of 14 ppm in the ¹H dimension and 32 ppm in the ¹⁵N dimension. The matrix size of collected spectra was 1024×256 total data points. NMR measurements of uniformly ²H,¹³C,¹⁵N-enriched OmpX in 5% DPC were obtained using a 600 MHz Bruker AVANCE spectrometer with a cryoprobe at 50° C. TROSY [H^(N)-¹⁵N] HSQC NMR data were acquired using spectral widths of 14 ppm in the ¹H dimension and 34 ppm in the ¹⁵N dimension. The matrix size of collected spectra was 1024×256 total data points. NMR spectra of CspA and EnvZB were recorded at 20° C. using an 800 MHz Bruker AVANCE spectrometer with cryogenic probe, except where noted otherwise. Non-constant time [¹³C-¹H]-HSQC spectra were acquired for both CspA and EnvZB with [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enrichment. Resonance assignments for [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enriched CspA were determined using conventional triple resonance NMR experiments, including HNCO and deuterium-decoupled pulse sequences HN(ca)CO; HNCA; HN(co)CA; HNCACB and HN(co)CACB. The carrier positions were set to 118.0 ppm for ¹⁵N, 176 ppm for ¹³CO, 54 ppm for ¹³C^(α) and 39 ppm for ¹³C^(α)/¹³C^(β). Key parameters of data collection are summarized in Table 2.
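Spectral widths quoted in ppm convert to hertz by multiplying by the resonance frequency of the nucleus in MHz. The Python sketch below illustrates this for the ¹H and ¹⁵N dimensions. It uses the nominal spectrometer frequency and an approximate ¹⁵N/¹H frequency ratio of 0.10133, both of which are assumptions made for illustration; the function names are hypothetical.

```python
# Approximate 15N/1H resonance frequency ratio (illustrative assumption).
N15_TO_H1_RATIO = 0.10133


def sweep_width_hz(width_ppm, nucleus_mhz):
    """Convert a spectral width in ppm to Hz (ppm x MHz = Hz)."""
    return width_ppm * nucleus_mhz


def n15_frequency_mhz(h1_mhz):
    """Approximate 15N resonance frequency for a given 1H frequency."""
    return h1_mhz * N15_TO_H1_RATIO


def hz_per_point(width_hz, n_points):
    """Digital resolution of a dimension before zero filling."""
    if n_points <= 0:
        raise ValueError("number of points must be positive")
    return width_hz / n_points


if __name__ == "__main__":
    h1_sw = sweep_width_hz(14, 800)                      # 1H dimension
    n15_sw = sweep_width_hz(32, n15_frequency_mhz(800))  # 15N dimension
    print(h1_sw, round(n15_sw, 1), round(hz_per_point(n15_sw, 256), 2))
```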

TABLE 2
800 MHz triple resonance data used for determining backbone resonance assignments.

Parameter | ¹⁵N-HSQC | HNcoCA | HNCO | HNCA | HNCACB | HNcoCACB | HNcaCO
No. of points, collected | 1024, 256 | 1024, 40, 50 | 1024, 40, 40 | 1024, 40, 50 | 1024, 64, 100 | 1024, 64, 100 | 1024, 40, 40
No. of points, after LP | 1024, 512 | 1024, 72, 82 | 1024, 72, 72 | 1024, 72, 82 | 1024, 96, 164 | 1024, 96, 164 | 1024, 72, 72
No. of points, after zero filling | 1024, 512 | 1024, 128, 128 | 1024, 128, 128 | 1024, 128, 128 | 1024, 128, 256 | 1024, 128, 256 | 1024, 128, 128
No. of scans | 8 | 4 | 4 | 4 | 16 | 16 | 16
Spectral width (ω₁, ω₂, ω₃; ppm) | 14, 28 | 14, 23, 32 | 14, 23, 24 | 14, 23, 32 | 14, 28, 72 | 14, 28, 72 | 14, 23, 24
Recycle delay (s) | 1 | 1 | 1 | 1 | 1 | 1 | 1
Collection time (h) | 0.6 | 2.2 | 2.0 | 2.2 | 33.2 | 33.6 | 8.6

The total data collection time for all of these triple resonance experiments was about 3.5 days. In addition, 3D ¹³C-edited NOESY (mixing time of 350 ms) and ¹⁵N-edited NOESY (mixing time of 175 ms) were collected on a 600 MHz Bruker spectrometer with TXT probe. The matrix sizes of these spectra were 1024×32×220 total data points for ¹³C-edited NOESY, and 1024×64×256 total data points for ¹⁵N-edited NOESY. For ¹³C-edited NOESY, the spectrum widths in the ¹H, ¹³C and indirectly detected ¹H dimensions were set to 14 ppm, 16 ppm and 12 ppm respectively, and the carrier positions were set to 4.7 ppm for the ¹H and 16 ppm for the ¹³C dimension. For ¹⁵N-edited NOESY, the spectrum widths in the ¹H, ¹⁵N and indirectly detected ¹H dimensions were set to 14 ppm, 28 ppm and 11.5 ppm respectively, and the carrier positions were set to 4.7 ppm for the ¹H and 118 ppm for the ¹⁵N dimension. In all NMR experiments, FIDs were processed with linear prediction and zero filling, and weighted by a sine bell function in all directly and indirectly detected dimensions. All NMR spectra were processed and examined with the NMRPipe and NMRDraw software packages. The program SPARKY was used for data visualization and analysis. Proton chemical shifts were referenced to external DSS.
¹³C and ¹⁵N chemical shifts were referenced indirectly based on the proton referencing.

Analysis of Resonance Assignments. AutoAssign software was used for semi-automated analysis of backbone and side chain ¹³C^(β) resonance assignments for CspA. Peak lists for the [¹⁵N-¹H^(N)]-HSQC and for the triple resonance experiments, including 3D HNCO; HN(ca)CO; HNCA; HN(co)CA; HNCACB and HN(co)CACB, were generated automatically using the ‘restrictive peak picking’ function of the SPARKY software; these peak lists were manually refined prior to input into AutoAssign for automated analysis of backbone resonance assignments. Side chain ¹³C and ¹H methyl resonances of Leu, Val and Ile (δ1) were determined subsequently by interactive spectral analysis using [¹³C-¹H]-HSQC, 3D ¹³C-edited NOESY, and 3D ¹⁵N-edited NOESY spectra.

Generating Protein Structures Using Chemical-Shift Based Protein Structure Prediction by ROSETTA (CS-ROSETTA). Chemical shift information, including backbone ¹³C^(α), ¹⁵N, ¹³C′, ¹H^(N) and side chain ¹³C^(β) assignments, was used as input for CS-ROSETTA. Three key steps are involved. First, based on the chemical shift values and protein sequences, peptide fragments were selected from a protein structure database using the MFR module of the NMRPipe software package. All proteins with PSI-BLAST e-score <0.05 with E. coli CspA were removed from the database. Second, a standard ROSETTA protocol was used for de novo structure generation. Third, ROSETTA all-atom models resulting from the above procedure were evaluated based on how well the chemical shifts predicted using SPARTA agree with the experimental chemical shifts. If the lowest energy models cluster within less than ˜2 Å of the model with the lowest energy, the structure prediction is considered successful and the lowest energy models are considered converged. A total of 10,000 all-atom Rosetta models were generated from the MFR-selected peptide fragments, using a cluster of 20 CPUs.
The 1,000 lowest-energy models were chosen and their all-atom ROSETTA energies were recalculated to account for their fitness with respect to the experimental chemical shift values. The lowest energy models are considered converged because their C^(α) rmsd values are less than ˜2 Å relative to the lowest energy model. The 10 lowest energy models were selected as a representation of the 3D structure of CspA. The CS-ROSETTA package used in this work may be downloaded from http://spin.niddk.nih.gov/bax/software/CSROSETTA/indes.html.

Conventional 3D Structure Calculations. Conventional 3D structure calculations were performed using the AutoStructure software ver. 2.2.1-CND for automated analysis of NOESY cross peak assignments, implemented together with the program CYANA ver. 2.1 for structure generation. The input for AutoStructure analysis consisted of (1) a list of backbone and ¹³C-¹H methyl side chain assignments; (2) manually edited NOESY peak lists, including chemical shifts and peak heights, generated from ¹³C-edited and ¹⁵N-edited NOESY spectra; (3) locations of slowly exchanging amide hydrogens based on published amide ¹H/²H exchange data for CspA; and (4) broad φ,ψ angle constraints (±40° and ±50°, respectively) derived from chemical shift data (after correction of the ²H isotope-shift effect) using the program TALOS. The best 10 of 56 structures (lowest energy) from the final cycle of AutoStructure were refined by restrained molecular dynamics in an explicit water bath using CNS 1.1.
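
The CS-ROSETTA convergence criterion described above (the lowest energy models lie within about 2 Å C^(α) rmsd of the lowest energy model) can be sketched as follows. This is a minimal illustration, not part of the CS-ROSETTA package: the function names, the tuple layout, and the example energies and rmsd values are hypothetical, and the rmsd of each model to the lowest energy model is assumed to have been computed beforehand.

```python
from typing import List, Tuple

# Each model is (all_atom_energy, ca_rmsd_to_lowest_energy_model_in_angstroms).
# The rmsd values are assumed to be precomputed; this sketch does no superposition.
Model = Tuple[float, float]


def lowest_energy_models(models: List[Model], n: int) -> List[Model]:
    """Return the n models with the lowest energy (hypothetical helper)."""
    return sorted(models, key=lambda m: m[0])[:n]


def is_converged(models: List[Model], n_lowest: int = 10, cutoff: float = 2.0) -> bool:
    """True if every one of the n_lowest models is within `cutoff` angstroms
    of the lowest energy model, following the ~2 A criterion in the text."""
    if not models:
        raise ValueError("no models supplied")
    return all(rmsd < cutoff for _, rmsd in lowest_energy_models(models, n_lowest))


# Made-up example values, for illustration only.
example = [(-120.0, 0.0), (-118.5, 0.8), (-117.9, 1.4), (-90.0, 6.2)]
print(is_converged(example, n_lowest=3))
```

In the procedure described above, the 10 lowest energy models after chemical-shift rescoring would play the role of `n_lowest`.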

TABLE 3
Summary of Structural Statistics for E. coli CspA Structures^(a)

                                                          Sparse-constraint   Conventionally-     CS-Rosetta
                                                          NMR                 determined NMR      Structure^(d)
                                                          Structure^(b)       Structure^(c)
Conformationally-restricting constraints^(e)
  Distance constraints
    Total                                                 131                 -                   131
    intra-residue (i = j)                                 17                  -                   17
    sequential (|i − j| = 1)                              45                  -                   45
    medium range (1 < |i − j| ≦ 5)                        8                   -                   8
    long range (|i − j| > 5)                              61                  -                   61
    distance constraints per residue                      2.0                 -                   2.0
  Dihedral angle constraints                              68                  -                   68
  Hydrogen bond constraints
    Total                                                 22                  -                   22
    long range (|i − j| > 5)                              20                  -                   20
  Number of constraints per residue                       3.3                 -                   3.3
  Number of long range constraints per residue            1.2                 -                   1.2
Residual constraint violations^(e)
  Average number of distance violations per structure
    0.1-0.2 Å                                             1.4                 -                   0.9
    0.2-0.5 Å                                             0                   -                   1.9
    >0.5 Å                                                0                   -                   3.7
  average RMS distance violation/constraint (Å)           0.02                -                   0.17
  maximum distance violation (Å)                          0.18                -                   1.74
  Average number of dihedral angle violations per residue
    1-10°                                                 3.6                 -                   3
    >10°                                                  0                   -                   0.8
  average RMS dihedral angle violation/constraint (°)     0.45                -                   1.73
  maximum dihedral angle violation (°)                    3.4                 -                   16.70
RMSD from average coordinates (Å)^(e,f)
  backbone atoms                                          1.2                 0.5                 0.8
  heavy atoms                                             1.7                 1.1                 1.2
RMSD from X-ray structure (Å)^(e,g)
  backbone atoms                                          1.58 ± 0.38         0.95 ± 0.11         0.52 ± 0.12
  heavy atoms                                             2.24 ± 0.34         1.63 ± 0.16         1.17 ± 0.11
Sidechain RMSD from X-ray structure (Å)^(e,h)
  heavy atoms                                             1.75 ± 0.20         1.59 ± 0.15         0.86 ± 0.11
  heavy sidechain atoms                                   1.81 ± 0.23         1.93 ± 0.22         1.14 ± 0.12
Ramachandran statistics^(e,f)
  most favored regions (%)                                92.0                78.3                93.7
  additional allowed regions (%)                          8.0                 21.7                6.3
  generously allowed (%)                                  0.0                 0.0                 0.0
  disallowed regions (%)                                  0.0                 0.0                 0.0
Global quality scores^(e)                                 Raw/Z-score         Raw/Z-score         Raw/Z-score
  Verify 3D                                               0.33/−2.09          0.43/−0.48          0.45/−0.16
  ProsaII                                                 0.61/−0.17          0.77/0.50           0.85/0.83
  Procheck (phi-psi)^(e)                                  −0.49/−1.61         −1.37/−5.07         −0.28/−0.79
  Procheck (all dihedrals)^(e)                            −0.42/−2.48         −1.47/−8.69         0.00/0.00
  Molprobity clash score                                  15.22/−1.09         64.74/−9.58         5.58/0.57

^(a)Analysis for residues 1 to 70, excluding disordered N-terminal expression tag.
^(b)Structure obtained from sparse NMR constraints.
^(c)NMR structure determined by conventional methods (PDB id 3mef).
^(d)Structure obtained from CS-Rosetta structure generation, compared with constraints; note that these distance constraints were not used in generating the CS-Rosetta structure.
^(e)Generated using PSVS 1.3 program. Average distance violations were calculated using the sum over r⁻⁶. Note that the conformational constraints were not used in CS-Rosetta calculations except to validate the structure by providing the statistics listed in this table.
^(f)Ordered residue ranges [S(phi) + S(psi) > 1.8]. NMR structure using minimum constraints: 4-24, 30-33, 35-36, 45-46, 51-55, 63-64, 67-69; Conventionally-determined NMR structure: 4-10, 20-23, 30-32, 48-51, 53-54, 68-69; CS-Rosetta generated structure: 4-27, 29-37, 40-60, 62-66.
^(g)Well-defined core region: 5-9, 19-22, 50-56, 63-69.
^(h)Buried hydrophobic residues: V9, I21, V30, V32, I37, L45, V51, F53, A64, V67.

Results

In order to compare protein expression levels in deuterium-based minimal medium under condensed culture conditions, two E. coli proteins, CspA (MW 9.4 kDa) and EnvZB (MW 19.5 kDa), were chosen for in-depth analysis. Data for nine (9) additional proteins expressed in condensed culture conditions (including membrane proteins, viral proteins, and eukaryotic proteins) are shown in FIG. 5. FIG. 14 demonstrates the effect of condensation and IPTG-induction of these two ACA-less target genes under conditions of growth arrest and cessation of cellular protein production. The levels of CspA and EnvZB produced per cell from a 40-fold (40×) condensed H₂O-based culture are comparable to those of the uncondensed (1×) culture, where cells were treated identically, differing only in the volume of media used for resuspension during the protein expression induction phase (FIGS. 1 a and b, lanes 2 and 3). CspA protein production is also unaffected by expression in 40× condensed ²H₂O-based medium, while EnvZB displays only a modest reduction in expression (FIGS. 1 a and b, lane 4). A second independent analysis of EnvZB at 20× and 40× condensation showed no observable decrease in expression in ²H₂O-based medium (FIG. 5, panel f). These data demonstrate that no acclimation of E. coli cells is required for preparing perdeuterated proteins at high yield in condensed phase using the cSPP system.

Proteins were purified by a single step Ni-NTA affinity purification. The resulting isotope-enriched samples are extensively perdeuterated. FIG. 6 presents LTQ linear ion trap mass spectrometry (MS) data for representative trypsin fragments from protein samples generated using the 40-fold condensed cSPP system, with uniform ²H,¹³C,¹⁵N-enrichment and ¹H-¹³C labeling of Ile(δ1), Leu, and Val methyl groups (referred to here as ILV-perdeuterated samples). The unlabeled peak present in each MS spectrum, 10% for CspA and 20% for EnvZB, represents a portion of the total protein with no label incorporation. This unlabeled population results from target protein expression along with MazF production, prior to culture condensation and resuspension in isotope enriched medium. While this unlabeled fraction decreases the effective yield of isotope-enriched protein, it is essentially invisible in the ¹⁵N and/or ¹³C-edited NMR spectra, and its presence is not problematic for routine triple-resonance NMR studies for protein structure determination.

Where required, the fraction of unlabeled protein can be reduced using a dual control vector, in which MazF is under the control of the IPTG-inducible lac operator and the target protein is under the control of the anhydrotetracycline-induced tet operator (tetO₁). In this system, MazF is induced prior to condensation; only after the cells are in a quasi-dormant state are they condensed into isotope-enriched media and the target protein induced by anhydrotetracycline. This results in much lower amounts (<3%) of unlabeled target protein (see FIG. 7). The percent of overall isotope incorporated into the CspA and EnvZB proteins (i.e. in the isotope-enriched fractions indicated in FIGS. 6 and 7) is ˜85% using either the IPTG or dual IPTG/Tet inducible systems. In the absence of protonated precursors for isoleucine, leucine, and valine, ²H incorporation also ranges from 85% to 100% (FIG. 8).
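
The effect of the unlabeled fraction on the effective yield of isotope-enriched protein can be illustrated with simple arithmetic. The sketch below uses only the unlabeled fractions reported above (10% for CspA and 20% for EnvZB with the IPTG system, and 3% as an upper bound for the dual control vector); the function name and dictionary labels are hypothetical.

```python
def labeled_fraction(unlabeled_fraction: float) -> float:
    """Fraction of the target protein that carries the isotope label."""
    if not 0.0 <= unlabeled_fraction <= 1.0:
        raise ValueError("unlabeled_fraction must be between 0 and 1")
    return 1.0 - unlabeled_fraction


# Unlabeled fractions as reported in the text; the dual-vector value is an upper bound.
reported_unlabeled = {
    "CspA, IPTG system": 0.10,
    "EnvZB, IPTG system": 0.20,
    "dual IPTG/Tet system (upper bound)": 0.03,
}

for name, unlabeled in reported_unlabeled.items():
    print(f"{name}: at least {labeled_fraction(unlabeled):.0%} labeled")
```

Multiplying the labeled fraction by the total purified protein gives the effective yield of isotope-enriched sample.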

The quality of the [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enriched proteins produced with the IPTG system was further assessed by 2D [¹H-¹³C] HSQC NMR analysis (FIG. 9). These data demonstrate the expected ¹³C-¹H peaks of the Ile(δ1), Leu, and Val methyl groups in the upfield region of the spectrum, and little or no resonance peak intensity in other regions of the spectrum, demonstrating that these highly-perdeuterated protein samples are suitable for triple-resonance NMR studies. Similar results are obtained using CspA produced with the dual control IPTG/tet induced system.

As an example of the utility of inexpensive production of perdeuterated proteins using the cSPP system, the use of these samples for rapid determination of the 3D structure of an 86-residue construct of E. coli CspA was evaluated. A 0.2 mM sample of ILV-perdeuterated CspA (produced with the IPTG system) was used for collection of deuterium-decoupled triple resonance experiments, including HNCO, HN(ca)CO, HNCA, HN(co)CA, HNCACB, and HN(co)CACB experiments (Table 2), collected over a period of 3.5 days at 800 MHz. FIG. 10 shows the excellent quality of the resulting 2D [¹H^(N)-¹⁵N]-HSQC spectrum. The program AutoAssign was then used for automatic analysis of backbone H^(N), ¹⁵N, ¹³C^(α), ¹³C′, and sidechain ¹³C^(β) resonance assignments, as documented in FIG. 11. The resulting resonance assignments are consistent with the published assignments for CspA (BMRB accession number 4296). Perdeuterated protein samples produced with the cSPP system thus provide high-quality NMR spectra suitable for rapid automated analysis of backbone resonance assignments.
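
As a quick consistency check, the per-experiment collection times listed in Table 2 can be summed and compared with the stated total of about 3.5 days. The values below are copied from Table 2; the variable names are illustrative only.

```python
# Collection times in hours, as listed in Table 2.
collection_time_h = {
    "15N-HSQC": 0.6,
    "HN(co)CA": 2.2,
    "HNCO": 2.0,
    "HNCA": 2.2,
    "HNCACB": 33.2,
    "HN(co)CACB": 33.6,
    "HN(ca)CO": 8.6,
}

total_h = sum(collection_time_h.values())
total_days = total_h / 24.0

# The sum is about 82.4 h, i.e. about 3.4 days, in line with the stated total.
print(f"total: {total_h:.1f} h = {total_days:.1f} days")
```

The two CACB experiments account for most of the instrument time, so they are the natural place to look if the data collection needs to be shortened.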

As a further example of the utility of perdeuterated samples produced with the cSPP system, rapid analysis of the 3D structure of [¹H-¹³C]-I(δ1)LV, ¹³C, ¹⁵N, ²H-enriched CspA without the need for sidechain resonance assignments was also demonstrated. The use of such ILV-perdeuterated samples for fully automated analysis of small protein folds using sparse NOESY data has been previously advocated. However, the recently introduced CS-Rosetta method provides an alternative approach for small protein structure analysis using only backbone and ¹³C^(β) chemical shift data. CS-Rosetta calculations were carried out using the resonance assignments determined with the perdeuterated CspA sample. The resulting ensemble of 10 structures, shown in FIG. 8, exhibits good structure quality scores (Table 3), and is in excellent agreement with the X-ray crystal structure of CspA, with backbone rmsd of 0.5 Å and all atom rmsd of 1.2 Å to the crystal structure for well-defined regions of the CS-Rosetta structure (1.1 Å rmsd to crystal structure for core, non-solvent-exposed sidechain atoms). In order to further assess the value of using CS-Rosetta to obtain an accurate structure of a perdeuterated protein, additional 3D ¹⁵N-edited NOESY and 3D ¹³C-edited NOESY data were acquired and used to assign side-chain methyl resonances, so as to determine the 3D structure by conventional automated methods with energy refinement. The resulting NMR ensemble (FIG. 12) is similar to the CS-Rosetta structure, but has lower precision, lower structure quality scores, and is less similar to the crystal structure (Table 3). Comparison of the CS-Rosetta structure with the resulting NOESY constraint list (also summarized in the statistics of Table 3) reveals that essentially all subsequently-derived NOE-based distance constraints are satisfied by each of the 10 CS-Rosetta structures, cross-validating the high accuracy of the CS-Rosetta structure. These results demonstrate a cost-effective approach for rapidly determining high-quality small protein NMR structures with accuracies rivaling structures determined using more extensive NMR methods with full side-chain assignments; however, nothing substitutes for obtaining a complete data set when the most accurate structures are required.

Perdeuteration is essential for solution NMR studies of membrane proteins, and can be achieved at a fraction of the cost of conventional methods by using the cSPP system. As a second example of the utility of the cSPP system for cost-effective production of larger perdeuterated proteins, the production of perdeuterated membrane proteins, including the 8 kDa, 70-residue E. coli plasma membrane protein YaiZ and the 16 kDa, 148-residue E. coli outer membrane protein OmpX, has also been demonstrated. The NMR sample of YaiZ was expressed using 20× condensation and OmpX was expressed using 40× condensation in perdeuterated media. Both YaiZ and OmpX were expressed at high levels (FIG. 5). YaiZ was found exclusively in the inner membrane fraction, and not in the outer membrane fraction or inclusion bodies. An NMR sample of YaiZ was prepared by detergent extraction directly from membrane fractions using 10% 1-oleoyl-2-hydroxy-sn-glycero-3-[phospho-RAC-(1-glycerol)] (LOPG). OmpX assembles into the outer membrane (FIG. 13), rather than as inclusion bodies as typically observed for outer membrane proteins. An NMR sample of OmpX was prepared by solubilizing the plasma membrane fraction with 0.5% sodium lauryl sarcosinate (Sarkosyl)²⁰, separating the insoluble outer membrane fraction, and then extracting OmpX into 5% dodecylphosphocholine (DPC) detergent.

These ²H,¹³C,¹⁵N-enriched YaiZ and OmpX samples in simple detergent extracts were studied using [H^(N),¹⁵N] TROSY NMR at 800 and 600 MHz, respectively (FIGS. 15 a and b). The YaiZ and OmpX spectra exhibit essentially all of the amide peaks expected. The distribution of amide peaks in the OmpX spectrum is similar, but not identical, to that reported for reconstituted OmpX samples¹⁹. While further purification does improve the spectra, these samples of YaiZ in a plasma membrane detergent extract and OmpX in an outer membrane detergent extract are suitable for some structural and functional studies by NMR without any further purification; however, further purification of these ²H-enriched samples would be recommended for extensive NMR structural studies or crystallization. These results demonstrate the utility of the cSPP system for production of perdeuterated membrane proteins.

Discussion

These results demonstrate an important new technology for producing perdeuterated proteins for NMR and other biophysical studies. The cSPP system can be used to produce perdeuterated, ¹³C,¹⁵N-enriched protein samples, with methyl, aromatic, or other ¹H-¹³C labeling patterns, as required, at a fraction of the cost of conventional systems with similar protein production yields. These methods have applications for studies of larger proteins, protein-protein complexes, and membrane proteins.

The most important feature of the SPP method is that cells are no longer growing following MazF induction. Therefore, the only protein being produced in the “semi-dormant state” is the target protein; i.e., all of the ¹³C,¹⁵N isotope is incorporated into the target protein, and none into other proteins, providing the ability to selectively detect the target protein using ¹³C and/or ¹⁵N-edited NMR experiments. Using conventional methods, these isotopes would be incorporated into many different proteins; i.e., the labeling would not be selective to the targeted protein. In addition, in the quiescent state the cells are dormant, with minimal recycling of unlabeled amino acids into new protein synthesis. The cSPP method can also be used in some cases to condense cells to very high densities (e.g. 40× or higher) in isotope-enriched media with no significant reduction in protein production yields per cell. Thus, the cSPP method has significant advantages over simply resuspending cells at higher concentrations in labeled media.

Routine access to inexpensive perdeuteration methods also provides a route to fully automated analysis of backbone assignments and 3D structures of smaller (<12 kDa) proteins. The studies discussed herein demonstrate cost-effective production of ²H,¹³C,¹⁵N-enriched proteins in the cSPP system. The resulting sample of CspA, purified with a single-step Ni-NTA affinity chromatography, allowed data collection and automated analysis of backbone ¹H^(N), ¹⁵N, ¹³C^(α), ¹³C′, as well as sidechain ¹³C^(β), assignments in only a few days. These data for a perdeuterated protein provided high-quality 3D structures of CspA using CS-Rosetta, rivaling the best NMR structures available to date for CspA using conventional methods requiring essentially complete sidechain proton assignments. Sparse NOESY data (e.g. H^(N)-H^(N) or H^(N)-CH₃ NOEs) can also be used, as necessary, to refine the CS-Rosetta structure. This represents a general approach to rapid structure analysis of small (<150 residue) proteins. This approach has tremendous value in generating assignments and structural information for small molecule screening studies, as well as in high-throughput structural and functional genomics studies.

In the case of membrane proteins, the selective labeling provided by the single protein production system allows NMR studies of a targeted membrane protein following simple detergent extraction. NMR spectra can be obtained without extensive purification or reconstitution, allowing a membrane protein's structural and functional properties to be characterized prior to reconstitution, or as a probe of the effects of subsequent purification steps on the structural integrity of membrane proteins.

In conclusion, the condensed-phase SPP technology for protein production allows cost-effective production of milligram quantities of perdeuterated proteins, facilitating many NMR, neutron diffraction, and other biophysical applications. The current technology typically provides samples that are ˜85% enriched in ²H, and 100% enriched in ¹⁵N and ¹³C. As demonstrated here for CspA, this labeling is sufficient for rapid analysis of resonance assignments and 3D structures of small soluble proteins and for production of valuable perdeuterated samples of larger soluble proteins and membrane proteins. The 10-40 fold reduction in costs of perdeuterated fermentation media provided by using the cSPP system opens the door to many new applications for perdeuterated proteins in spectroscopic and crystallographic studies.
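
The 10-40 fold reduction in media cost follows directly from resuspending the cells in a proportionally smaller volume of isotope-enriched medium. A minimal sketch of this arithmetic is given below, assuming that the cost of labeled medium scales linearly with its volume and ignoring the cost of the unlabeled growth medium; the per-liter prices are the $1500-$3,000 range quoted in the Background, and the function name is hypothetical.

```python
def labeled_medium_cost(cost_per_liter: float, culture_liters: float,
                        condensation: float) -> float:
    """Cost of isotope-enriched medium needed for cells harvested from
    `culture_liters` of culture and resuspended at `condensation`-fold density.
    Assumes cost is proportional to the volume of labeled medium used."""
    if condensation < 1:
        raise ValueError("condensation factor must be at least 1")
    return cost_per_liter * culture_liters / condensation


# Labeled-medium cost for cells from 1 liter of culture at the quoted prices.
for fold in (1, 10, 40):
    low = labeled_medium_cost(1500.0, 1.0, fold)
    high = labeled_medium_cost(3000.0, 1.0, fold)
    print(f"{fold}x condensation: ${low:.2f} to ${high:.2f}")
```

Under these assumptions, 40× condensation brings the labeled-medium cost for cells from one liter of culture down to the range of $37.50 to $75.00.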

1. A vector comprising: a. a cspA cold shock promoter; b. a tetR gene; c. a tet operator; and d. a gene encoding a target protein under the control of the tet operator.
2. The vector of claim 1, further comprising a sequence encoding at least one fusion tag.
3. The vector of claim 2, wherein the fusion tag increases translational efficiency, aids in purification of the target protein, or both.
4. The vector of claim 2, wherein the fusion tag is a translational enhancing element or a His₆ tag.
5. The vector of claim 2, wherein the fusion tag sequence encodes an amino acid sequence selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10 and SEQ ID NO: 12.
6. The vector of claim 4, wherein the fusion tag comprises a His₆ tag, further comprising a Factor Xa protease cleavage site or a TEV protease cleavage site.
7. The vector of claim 1, wherein the target protein is capable of being induced by tetracycline or an analog thereof.
8. The vector of claim 1, wherein the target protein is ACA-less.
9. The vector of claim 1, further comprising a gene encoding an OmpA signal peptide.
10. The vector of claim 1, wherein the gene encoding the target protein is sub-cloned into the vector between NdeI and BamHI restriction sites.
11. A cell containing: a. a vector comprising a gene encoding a target protein; and b. a vector comprising a gene encoding an mRNA-specific endoribonuclease, wherein the target protein and mRNA-specific endoribonuclease are capable of being induced with different substances.
12. The cell of claim 11, wherein the mRNAs associated with the expression of the target protein do not contain the sequence at which the mRNA-specific endoribonuclease cleaves.
13. The cell of claim 11, wherein the mRNA-specific endoribonuclease is MazF.
14. The cell of claim 11, wherein the mRNA-specific endoribonuclease is capable of being induced by IPTG.
15. The cell of claim 11, wherein the target protein is capable of being induced with tetracycline or an analog thereof.
16. The cell of claim 11, wherein the mRNA-specific endoribonuclease is capable of being induced by 3-β-indoleacrylic acid (IAA), L-arabinose, or L-rhamnose.
17. The cell of claim 11, wherein the target protein is capable of being induced by 3-β-indoleacrylic acid (IAA), L-arabinose, or L-rhamnose.
18. A method of labeling a target protein comprising: a. contacting a culture of cells of claim 11 with a substance capable of inducing the mRNA-specific endoribonuclease; and b. contacting a culture of the cells of claim 11 with an isotope-enriched medium comprising a substance capable of inducing the target protein.
19. The method of claim 18, further comprising condensing the culture to a desired culture volume.
20. The method of claim 19, wherein the substance capable of inducing the target protein is anhydrotetracycline, and wherein the volume of anhydrotetracycline present is between about 0.1 μg/mL and about 0.2 μg/mL multiplied by the degree of condensation.
21. The method of claim 19, wherein the substance capable of inducing the target protein is anhydrotetracycline, and wherein the volume of anhydrotetracycline present is about 0.15 μg/mL.
22. The method of claim 18, wherein at least 80% of the target protein is labeled.
23. The method of claim 18, wherein at least 85% of the target protein is labeled.
24. The method of claim 18, wherein at least 90% of the target protein is labeled.
25. The method of claim 18, wherein at least 95% of the target protein is labeled.
26. The method of claim 18, wherein the substance capable of inducing the target protein is tetracycline or an analog thereof.
27. The method of claim 18, wherein the substance capable of inducing the target protein is anhydrotetracycline.
28. The method of claim 18, wherein the substance capable of inducing the target protein is 3-β-indoleacrylic acid (IAA), L-arabinose, or L-rhamnose.
29. The method of claim 18, wherein the substance capable of inducing the mRNA-specific endoribonuclease is 3-β-indoleacrylic acid (IAA), L-arabinose, or L-rhamnose.
30. A protein labeled by the method of claim 18.
31. The protein of claim 30, comprising at least one fusion tag.
32. The protein of claim 31, wherein the fusion tag increases translational efficiency, aids in purification of the target protein, or both.
33. The protein of claim 31, wherein the fusion tag is selected from the group consisting of a translational enhancing element and a His₆ tag.